16 research outputs found

    ExSurv: A Web Resource for Prognostic Analyses of Exons Across Human Cancers Using Clinical Transcriptomes

    Survival analysis in biomedical sciences is generally performed by correlating the levels of cellular components with patients' clinical features, a common practice in prognostic biomarker discovery. While the primary focus of such analysis in cancer genomics has so far been to identify potential prognostic genes, alternative splicing, a posttranscriptional regulatory mechanism that affects the functional form of a protein through inclusion or exclusion of individual exons, giving rise to alternative protein products, has increasingly gained attention due to the prevalence of splicing aberrations in cancer transcriptomes. Hence, uncovering potential prognostic exons can not only help in rationally designing exon-specific therapeutics but also increase specificity toward more personalized treatment options. To address this gap and to provide a platform for rational identification of prognostic exons from cancer transcriptomes, we developed ExSurv (https://exsurv.soic.iupui.edu), a web-based platform for predicting the survival contribution of all annotated exons in the human genome using RNA sequencing-based expression profiles for cancer samples from four cancer types available from The Cancer Genome Atlas. ExSurv enables users to search for a gene of interest and shows survival probabilities for all the exons associated with that gene that are significant at the chosen threshold. ExSurv also includes raw expression values across the cancer cohort as well as the survival plots for prognostic exons. Our analysis of the resulting prognostic exons across four cancer types revealed that most of the survival-associated exons are unique to a cancer type, with few processes such as cell adhesion, carboxylic, fatty acid metabolism, and regulation of T-cell signaling common across cancer types, possibly suggesting significant differences in the posttranscriptional regulatory pathways contributing to prognosis
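
As a rough illustration of how survival probabilities can be derived for one exon, the sketch below splits patients into high and low expression groups and computes a Kaplan-Meier curve for a group. The abstract does not state which estimator or grouping ExSurv uses, so the median split, the function names, and the input layout are all assumptions made for this example.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t are no longer at risk.
        at_risk -= leaving
    return curve


def split_by_median(expression):
    """Hypothetical grouping: patient indices above vs. at or below the median."""
    cutoff = median(expression)
    high = [i for i, x in enumerate(expression) if x > cutoff]
    low = [i for i, x in enumerate(expression) if x <= cutoff]
    return high, low
```

A curve would be computed once per group, and the two curves compared with a test such as the log-rank test to decide whether the exon is significant at the chosen threshold.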

    Seten: a tool for systematic identification and comparison of processes, phenotypes, and diseases associated with RNA-binding proteins from condition-specific CLIP-seq profiles

    RNA-binding proteins (RBPs) control the regulation of gene expression in eukaryotic genomes at post-transcriptional level by binding to their cognate RNAs. Although several variants of CLIP (crosslinking and immunoprecipitation) protocols are currently available to study the global protein-RNA interaction landscape at single-nucleotide resolution in a cell, currently there are very few tools that can facilitate understanding and dissecting the functional associations of RBPs from the resulting binding maps. Here, we present Seten, a web-based and command line tool, which can identify and compare processes, phenotypes, and diseases associated with RBPs from condition-specific CLIP-seq profiles. Seten uses BED files resulting from most peak calling algorithms, which include scores reflecting the extent of binding of an RBP on the target transcript, to provide both traditional functional enrichment as well as gene set enrichment results for a number of gene set collections including BioCarta, KEGG, Reactome, Gene Ontology (GO), Human Phenotype Ontology (HPO), and MalaCards Disease Ontology for several organisms including fruit fly, human, mouse, rat, worm, and yeast. It also provides an option to dynamically compare the associated gene sets across data sets as bubble charts, to facilitate comparative analysis. Benchmarking of Seten using eCLIP data for IGF2BP1, SRSF7, and PTBP1 against their corresponding CRISPR RNA-seq in K562 cells as well as randomized negative controls, demonstrated that its gene set enrichment method outperforms functional enrichment, with scores significantly contributing to the discovery of true annotations. Comparative performance analysis using these CRISPR control data sets revealed significantly higher precision and comparable recall to that observed using ChIP-Enrich. 
Seten's web interface currently provides precomputed results for about 200 CLIP-seq data sets and both command line as well as web interfaces can be used to analyze CLIP-seq data sets. We highlight several examples to show the utility of Seten for rapid profiling of various CLIP-seq data sets
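
The two inputs described above, scored peaks in BED format and a gene set collection, can be sketched as follows. This is not Seten's implementation: it assumes the BED name column (column 4) holds a gene symbol, keeps the maximum peak score per gene, and uses a generic hypergeometric over-representation test to stand in for the "traditional functional enrichment" step. All function names are hypothetical.

```python
from math import comb


def gene_scores_from_bed(lines):
    """Collapse scored BED peaks to one score per gene (maximum peak score).

    Assumes tab-separated BED with the gene symbol in column 4 and the
    peak score in column 5.
    """
    scores = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        gene, score = fields[3], float(fields[4])
        scores[gene] = max(score, scores.get(gene, float("-inf")))
    return scores


def hypergeom_pvalue(hits, targets, set_size, background):
    """P(X >= hits) when `targets` genes are drawn from `background` genes,
    of which `set_size` belong to the gene set."""
    total = comb(background, targets)
    upper = min(targets, set_size)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, targets - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

A score-aware gene set enrichment method would additionally use the per-gene scores returned by `gene_scores_from_bed`, which is the property the benchmarking above credits for the improved discovery of true annotations.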

    Reconstruction of the temporal signaling network in Salmonella-infected human cells

    Salmonella enterica is a bacterial pathogen that usually infects its host through food sources. Translocation of the pathogen proteins into the host cells leads to changes in the signaling mechanism either by activating or inhibiting the host proteins. Using high-throughput ‘omic’ technologies, changes in the signaling components can be quantified at different levels; however, experimental hits are usually incomplete to represent the whole signaling system as some driver proteins stay hidden within the experimental data. Given that the bacterial infection modifies the response network of the host, a more coherent view of the underlying biological processes and the signaling networks can be obtained by using a network modeling approach based on the reverse engineering principles in which a confident region from the protein interactome is found by inferring hits from the omic experiments. In this work, we have used a published temporal phosphoproteomic dataset of Salmonella-infected human cells and reconstructed the temporal signaling network of the human host by integrating the interactome and the phosphoproteomic datasets. We have combined two well-established network modeling frameworks, the Prize-collecting Steiner Forest (PCSF) approach and the Integer Linear Programming (ILP) based edge inference approach. The resulting network conserves the information on temporality and direction of interactions, while revealing hidden entities in the signaling, such as the SNARE binding, mTOR signaling, immune response, cytoskeleton organization, and apoptosis pathways. Targets of the Salmonella effectors in the host cells such as CDC42, RHOA, 14-3-3η, Syntaxin family, Oxysterol-binding proteins were included in the reconstructed signaling network although they were not present in the initial phosphoproteomic data. 
We believe that integrated approaches have a high potential for the identification of clinical targets in infectious diseases, especially in Salmonella infections

    Express: A database of transcriptome profiles encompassing known and novel transcripts across multiple development stages in eye tissues

    Advances in sequencing have facilitated nucleotide-resolution genome-wide transcriptomic profiles across multiple mouse eye tissues. However, these RNA sequencing (RNA-seq) based eye developmental transcriptomes are not organized for easy public access, making any further analysis challenging. Here, we present a new database “Express” (http://www.iupui.edu/~sysbio/express/) that unifies various mouse lens and retina RNA-seq data and provides user-friendly visualization of the transcriptome to facilitate gene discovery in the eye. We obtained RNA-seq data encompassing 7 developmental stages of lens in addition to that on isolated lens epithelial and fibers, as well as on 11 developmental stages of retina/isolated retinal rod photoreceptor cells from publicly available wild-type mouse datasets. These datasets were pre-processed, aligned, quantified and normalized for expression levels of known and novel transcripts using a unified expression quantification framework. Express provides heatmap and browser view allowing easy navigation of the genomic organization of transcripts or gene loci. Further, it allows users to search candidate genes and export both the visualizations and the embedded data to facilitate downstream analysis. We identified a total of >81,000 transcripts in the lens and >178,000 transcripts in the retina across all the included developmental stages. This analysis revealed that a significant number of the retina-expressed transcripts are novel. Expression of several transcripts in the lens and retina across multiple developmental stages was independently validated by RT-qPCR for established genes such as Pax6 and Lhx2 as well as for new candidates such as Elavl4, Rbm5, Pabpc1, Tia1 and Tubb2b. Thus, Express serves as an effective portal for analyzing pruned RNA-seq expression datasets presently collected for the lens and retina. 
It will allow a wild-type context for the detailed analysis of targeted gene-knockout mouse ocular defect models and facilitate the prioritization of candidate genes from Exome-seq data of eye disease patients

    Inferring causal molecular networks: empirical assessment through a community-based effort

    It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense

    Inferring causal molecular networks: empirical assessment through a community-based effort

    Inferring molecular networks is a central challenge in computational biology. However, it has remained unclear whether causal, rather than merely correlational, relationships can be effectively inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge that focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results constitute the most comprehensive assessment of causal network inference in a mammalian setting carried out to date and suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess the causal validity of inferred molecular networks

    Seten: A tool for systematic identification and comparison of processes, phenotypes and diseases associated with RNA-binding proteins from condition-specific CLIP-seq profiles

    RNA-binding proteins (RBPs) control the regulation of gene expression at the posttranscriptional level. Several CLIP (crosslinking and immunoprecipitation) protocols are available to study RBPs; however, there are very few tools for understanding their gene set associations. We developed Seten to help identify and compare processes, phenotypes and diseases associated with RBPs. After peak detection, the peaks can be given to Seten as BED files or gene-score pairs for this analysis. The peak scores reflect the extent of binding of an RBP on the target transcript and can be used for gene set enrichment analysis. Seten provides gene set enrichment and functional enrichment methods covering pathway, Gene Ontology, phenotype and disease gene set collections for a number of organisms. Seten has a web (JavaScript) and a command line (Python) interface. The web interface provides several visualization options for identification and comparison of results. The command line interface can be used to analyze multiple datasets. We also provide precomputed results for more than 200 CLIP-seq datasets from CLIPdb and ENCODE peak-detected datasets

    Identification of the Ischemic Pathway Level Changes by Integrating Temporal Phosphoproteome in Ovarian Cancer

    Temporal proteome studies aim to evaluate changes in the proteome and phosphoproteome when there is a delay in freezing tissues following tumor excision. Cold ischemia corresponds to the delay between excision and freezing. Recently, Carr and his colleagues have shown that ischemia does not affect the global proteome, but the phosphoproteome shows a 10% change during ischemia. In this work, we use this published phosphoproteomic dataset derived from tumors of four ovarian cancer patients, where the samples are frozen without any delay and with a delay of 5, 30 and 60 minutes [1]. We first analyzed the post-excision phosphorylation changes across different patients and showed that the ischemic effect is very heterogeneous at the phosphoproteomic level. Then, we reconstructed patient-specific network models by integrating the confidence-weighted interactome and the phosphoproteome. For this purpose, we use the Omics Integrator software, which solves the prize-collecting Steiner forest (PCSF) problem [2]. On the one hand, it tries to include as many of the proteomic hits as possible; on the other hand, it tries to keep the network small by avoiding unreliable protein-protein interactions. Using these networks, we are able to track enriched biological processes across time points. For example, many biological processes, e.g. apoptosis and cellular response to stress, are significantly enriched in any tumor. We are also able to extract the biological processes that are significantly enriched at any time point and in any tumor. Next, we aligned the reconstructed patient-specific networks and identified common and unique pathways across these patients as well as important subnetworks. We also compared these networks based on their functional enrichments to identify the biological processes and pathways that play a role in different patients. The results of this study provide information about the ischemic pathways and proteins in addition to the cancer-related pathways. 
Because the excision time is not known in many studies, including TCGA, the benchmark ischemic pathways and proteins will enable us to distinguish cancer-related pathways from ischemic artefacts
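
The trade-off described above, collecting as many hits as possible while avoiding unreliable interactions, can be made concrete by scoring a candidate forest. The sketch below evaluates a commonly used form of the PCSF objective: beta times the prizes of excluded nodes, plus the costs of included edges, plus omega times the number of trees. It only scores a given forest and does not solve the optimization; the function name, the parameters `beta` and `omega`, and the use of 1 minus interaction confidence as edge cost are assumptions for illustration, not the Omics Integrator API.

```python
def pcsf_objective(prizes, edge_costs, forest_nodes, forest_edges,
                   beta=1.0, omega=1.0):
    """Score a candidate forest; lower is better.

    prizes       -- {node: prize}, e.g. phosphorylation change per protein
    edge_costs   -- {frozenset((u, v)): cost}, e.g. 1 - interaction confidence
    forest_nodes -- set of nodes kept in the solution
    forest_edges -- list of (u, v) pairs kept; assumed to contain no cycles
    """
    # Penalty for every prize-carrying node left out of the network.
    missed = sum(p for node, p in prizes.items() if node not in forest_nodes)
    # Cost paid for every interaction used, so unreliable edges are avoided.
    cost = sum(edge_costs[frozenset(edge)] for edge in forest_edges)
    # An acyclic graph with |V| nodes and |E| edges has |V| - |E| trees.
    n_trees = len(forest_nodes) - len(forest_edges)
    return beta * missed + cost + omega * n_trees
```

Raising `beta` favors including more experimental hits, while raising `omega` favors fewer, larger trees.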

    Transcriptome analysis of developing lens reveals abundance of novel transcripts and extensive splicing alterations

    Lens development involves a complex and highly orchestrated regulatory program. Here, we investigate the transcriptomic alterations and splicing events during mouse lens formation using RNA-seq data from multiple developmental stages, and construct a molecular portrait of known and novel transcripts. We show that the extent of novelty of expressed transcripts decreases significantly in post-natal lens compared to embryonic stages. Characterization of novel transcripts into partially novel transcripts (PNTs) and completely novel transcripts (CNTs) (novelty score ≥ 70%) revealed that the PNTs are both highly conserved across vertebrates and highly expressed across multiple stages. Functional analysis of PNTs revealed their widespread role in lens developmental processes while hundreds of CNTs were found to be widely expressed and predicted to encode for proteins. We verified the expression of four CNTs across stages. Examination of splice isoforms revealed skipped exon and retained intron to be the most abundant alternative splicing events during lens development. We validated, by RT-PCR and Sanger sequencing, the predicted splice isoforms of several genes: Banf1, Cdk4, Cryaa, Eif4g2, Pax6, and Rbm5. Finally, we present a splicing browser, Eye Splicer (http://www.iupui.edu/~sysbio/eye-splicer/), to facilitate exploration of developmentally altered splicing events and to improve understanding of post-transcriptional regulatory networks during mouse lens development

    M4B: A novel method for designing and ordering of the genetic devices

    In synthetic biology, designing a new genetic construct demands detailed studies of its candidate components, both individually and in composition with each other. These costly wet lab experiments require a considerable amount of time and usually result in undesired output. In this paper, we propose a method that extracts existing or novel synthetic devices from the biological parts or devices available in iGEM's BioParts Registry and orders the resulting devices by their computed reliabilities. This method is very efficient, and it helps wet lab biologists design their genetic devices based on the given input and output in a reasonable amount of time. The method is implemented in "Mining for BioBricks" (M4B), a web-based application that facilitates the prediction of novel genetic devices based on the given input and output. © 2012 IEEE